Facilitating Reconciliation of Inter-Annotator Disagreements
Abstract
Development and evaluation of Natural Language Processing methods often require text annotation. To gauge the difficulty of the task and to increase the reliability and quality of the annotations, researchers often recruit at least two annotators. Discrepancies between the annotators' annotations then need to be identified and reconciled. We present a tool that identifies such discrepancies and helps reconcile and validate annotations in Brat, a widely used annotation tool.
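To make the idea of identifying discrepancies concrete, here is a minimal sketch that compares two annotators' text-bound annotations in Brat's standoff format (lines such as `T1<TAB>Drug 0 7<TAB>aspirin`). It is not the tool described in the abstract; the names `Span`, `parse_text_bound`, and `find_discrepancies` are hypothetical, and the sketch assumes that only exact offset matches count as agreement.

```python
from typing import NamedTuple


class Span(NamedTuple):
    """A text-bound annotation reduced to its label and character offsets."""
    label: str
    start: int
    end: int


def parse_text_bound(ann_text: str) -> set[Span]:
    """Collect text-bound (T) annotations from Brat standoff content."""
    spans = set()
    for line in ann_text.splitlines():
        if not line.startswith("T"):
            continue  # skip relations, events, attributes, and notes
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # malformed line, ignored in this sketch
        label, offsets = fields[1].split(" ", 1)
        # Discontinuous spans look like "0 5;10 15"; keep only the outer bounds.
        fragments = [fragment.split() for fragment in offsets.split(";")]
        spans.add(Span(label, int(fragments[0][0]), int(fragments[-1][1])))
    return spans


def find_discrepancies(ann_a: str, ann_b: str) -> dict[str, list]:
    """Split two annotators' spans into agreements and disagreements."""
    a, b = parse_text_bound(ann_a), parse_text_bound(ann_b)
    only_a, only_b = a - b, b - a
    # Same offsets with different labels: the annotators disagree on the type.
    label_conflicts = sorted(
        (x, y)
        for x in only_a
        for y in only_b
        if (x.start, x.end) == (y.start, y.end)
    )
    conflicted = {span for pair in label_conflicts for span in pair}
    return {
        "agreed": sorted(a & b),
        "label_conflicts": label_conflicts,
        "only_a": sorted(only_a - conflicted),
        "only_b": sorted(only_b - conflicted),
    }
```

Calling `find_discrepancies` with the contents of two `.ann` files for the same document returns the spans both annotators agree on, the spans labeled differently, and the spans marked by only one annotator. Partial overlaps are reported here as one-sided spans; a real reconciliation workflow would likely treat them as a separate case.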
Similar resources
Annotators' Certainty and Disagreements in Coreference and Bridging Annotation in Prague Dependency Treebank
In this paper, we present the results of the parallel Czech coreference and bridging annotation in the Prague Dependency Treebank 2.0. The annotation is carried out on dependency trees (on the tectogrammatical layer). We describe the inter-annotator agreement measurement, classify and analyse the most common types of annotators’ disagreement. On two selected long texts, we asked the annotators ...
Community annotation experiment for ground truth generation for the i2b2 medication challenge
OBJECTIVE Within the context of the Third i2b2 Workshop on Natural Language Processing Challenges for Clinical Records, the authors (also referred to as 'the i2b2 medication challenge team' or 'the i2b2 team' for short) organized a community annotation experiment. DESIGN For this experiment, the authors released annotation guidelines and a small set of annotated discharge summaries. They aske...
Learning part-of-speech taggers with inter-annotator agreement loss
In natural language processing (NLP) annotation projects, we use inter-annotator agreement measures and annotation guidelines to ensure consistent annotations. However, annotation guidelines often make linguistically debatable and even somewhat arbitrary decisions, and interannotator agreement is often less than perfect. While annotation projects usually specify how to deal with linguistically ...
Annotating Named Entities in Consumer Health Questions
We describe a corpus of consumer health questions annotated with named entities. The corpus consists of 1548 de-identified questions about diseases and drugs, written in English. We defined 15 broad categories of biomedical named entities for annotation. A pilot annotation phase in which a small portion of the corpus was double-annotated by four annotators was followed by a main phase in which ...
Evaluating Hierarchical Structure in Music Annotations
Music exhibits structure at multiple scales, ranging from motifs to large-scale functional components. When inferring the structure of a piece, different listeners may attend to different temporal scales, which can result in disagreements when they describe the same piece. In the field of music informatics research (MIR), it is common to use corpora annotated with structural boundaries at diffe...